Output-sensitive Skyline Algorithms in External Memory
Abstract
This paper presents new results in external memory for finding the skyline (a.k.a. maxima) of $N$ points in $d$-dimensional space. The state of the art uses $O((N/B) \log^{d-2}_{M/B}(N/B))$ I/Os for fixed $d \ge 3$, and $O((N/B) \log_{M/B}(N/B))$ I/Os for $d = 2$, where $M$ and $B$ are the sizes (in words) of memory and a disk block, respectively. We give algorithms whose running time depends on the number $K$ of points in the skyline. Specifically, we achieve $O((N/B) \log^{d-2}_{M/B}(K/B))$ expected cost for fixed $d \ge 3$, and $O((N/B) \log_{M/B}(K/B))$ worst-case cost for $d = 2$. As a side product, we solve two problems, both of independent interest. The first one, the $M$-skyline problem, aims at reporting $M$ arbitrary skyline points, or the entire skyline if its size is at most $M$. We settle this problem in $O(N/B)$ expected time in any fixed dimensionality $d$. The second one, the $M$-pivot problem, is more fundamental: given a set $S$ of $N$ elements drawn from an ordered domain, it outputs $M$ evenly scattered elements (called pivots) from $S$, such that $S$ has asymptotically the same number of elements between each pair of consecutive pivots. We give a deterministic algorithm for solving the problem in $O(N/B)$ I/Os.
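To make the two notions in the abstract concrete, here is a minimal in-memory sketch in Python. It is not the paper's external-memory method: it ignores block transfers, uses a quadratic dominance check for the skyline, and selects pivots by fully sorting the input. The function names `dominates`, `skyline`, and `pivots` are hypothetical, and the "larger is better in every coordinate" convention for maxima is an assumption.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Assumed convention: p dominates q if p >= q in every coordinate
    and p > q in at least one coordinate."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point.

    Quadratic-time illustration of the definition only; it does not
    model the I/O cost that the abstract analyses.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pivots(items: List[float], m: int) -> List[float]:
    """Return m roughly evenly spaced elements of items.

    Uses a full sort for simplicity, which is not the linear-I/O
    deterministic algorithm claimed in the abstract.
    """
    ordered = sorted(items)
    n = len(ordered)
    if n == 0 or m <= 0:
        return []
    # Pick the element at rank (i + 1) * n / (m + 1) for i = 0..m-1.
    return [ordered[(i + 1) * n // (m + 1)] for i in range(m)]


if __name__ == "__main__":
    pts = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (4, 1)]
    print(skyline(pts))
    print(pivots(list(range(1, 100)), 3))
```

In this sketch, the number of points returned by `skyline` plays the role of $K$, and the gaps between consecutive values returned by `pivots` each contain about $N/(M+1)$ elements.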
Similar resources
A Study on External Memory Scan-Based Skyline Algorithms
Skyline queries return the set of non-dominated tuples, where a tuple is dominated if there exists another with better values on all attributes. In the past few years the problem has been studied extensively, and a great number of external memory algorithms have been proposed. We thoroughly study the most important scan-based methods, which perform a number of passes over the database in order ...
Skyline Computation with Noisy Comparisons
Given a set of n points in a d-dimensional space, we seek to compute the skyline, i.e., those points that are not strictly dominated by any other point, using few comparisons between elements. We study the crowdsourcing-inspired setting ([FRPU94]) where comparisons fail with constant probability. In this model, Groz & Milo [GM15] show three bounds on the query complexity for the skyline problem...
Faster output-sensitive skyline computation algorithm
We present the second output-sensitive skyline computation algorithm, which is faster than the only existing output-sensitive skyline computation algorithm [1] in the worst case because our algorithm does not rely on the existence of a linear-time procedure for finding medians.
Dissertation Defense Efficient and Adaptive Skyline Computation
Abstract: Skyline, also known as Maxima in computational geometry or Pareto in the business management field, is important for many applications involving multi-criteria decision making. The skyline of a set of multi-dimensional data points consists of the points for which no other point exists that is better in at least one dimension and at least as good in every other dimension. Although skyline ...
External Memory Algorithms for String Problems
In this paper we present external memory algorithms for some string problems. External memory algorithms have been developed in many research areas, as the speed gap between fast internal memory and slow external memory continues to grow. The goal of external memory algorithms is to minimize the number of input/output operations between internal memory and external memory. In recent years the sizes...